Assessment of a Beta Prior Distribution: PM Elicitation



KATHRYN M. CHALONER and GEORGE T. DUNCAN

Department of Statistics and School of Urban and Public Affairs, 
Carnegie-Mellon University, Pittsburgh, PA 15213 



Abstract: An interactive computer scheme is described for eliciting from an analyst a beta prior
distribution on the parameter π of a binomial distribution. Information on the analyst's
beta-binomial predictive distribution is obtained through a questioning and feedback algorithm
based on modes.



1 Introduction 

Practical implementation of subjective Bayesian methods requires the assessment of a prior 
distribution. This prior distribution is personal to the substantive expert, the analyst, who 
has primary concern for the results of the statistical analysis. The role of the statistician is to 
support the statistical analysis by providing a body of generally applicable technique and, 
in particular, in providing elicitation procedures for the personal prior distribution of the 
analyst. 

Interactive statistical computing provides the opportunity for new and workable methods 
of prior distribution assessment. Kadane et al. (1980) demonstrated a method for eliciting 
a natural conjugate prior for the normal regression model. This article proposes interactive 
elicitation methodology for a natural conjugate beta prior distribution for a binomial 
parameter π.

We base our elicitation method on the persuasive arguments of Geisser (1980) and 
Kadane (1980) that the natural elements for statistical inference are characteristics of 
the predictive distribution of an analyst. That is, our concern is with the distribution of an 
observable quantity X, unconditional on any unobservable parameter π. This distribution
differs from the sampling distribution of X, which is the distribution of X conditional on
the value of the parameter π. Indeed, the predictive distribution function of X at any
particular value x has a representation as a probability-weighted average over π of the sampling
distribution functions. The probability weights are given by the prior distribution. An
important fact is that specification of certain characteristics of the predictive distribution 
determines the prior distribution. This fact allows elicitation of a prior distribution to be 
carried out through elicitation of a predictive distribution, which is presumably a cognitively 
easier task since only potentially observable quantities are involved. 
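
In symbols, the representation just described can be sketched as follows. The notation p(π) for the prior density and the beta function B(·, ·) are assumptions introduced here for illustration; the beta case is the one treated in the rest of the article.

```latex
P(X = x) \;=\; \int_0^1 \binom{n}{x}\, \pi^{x} (1-\pi)^{n-x}\, p(\pi)\, d\pi ,
\qquad x = 0, 1, \ldots, n .

% With a beta prior, p(\pi) = \pi^{\alpha-1} (1-\pi)^{\beta-1} / B(\alpha, \beta),
% the integral has the closed beta-binomial form
P(X = x) \;=\; \binom{n}{x}\, \frac{B(\alpha + x,\; \beta + n - x)}{B(\alpha, \beta)} .
```

The second line shows why characteristics of the predictive distribution can determine the prior: the predictive probabilities depend on the prior only through α and β.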

Perhaps the first application of these ideas was by Bayes (1764). As interpreted by Stigler
(1982), Bayes considered situations in which for any n Bernoulli trials, independent
conditionally on the parameter π, the predictive distribution was uniform. That is,
P(X = x) = 1/(n + 1) for x = 0, 1, ..., n. Bayes took this uniform predictive distribution for
the observable X as indicative of lack of knowledge about the unobservable π.
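
The uniformity of the predictive distribution can be checked numerically. The sketch below assumes the standard beta-binomial probability function and takes the uniform prior to be the beta distribution with both parameters equal to one; the function names are illustrative and not part of the original program.

```python
from math import exp, lgamma


def beta_binomial_pmf(x, n, alpha, beta):
    """P(X = x) in n trials when the success probability has a Beta(alpha, beta) prior."""
    log_p = (
        lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)      # log binomial coefficient
        + lgamma(alpha + x) + lgamma(beta + n - x)
        - lgamma(alpha + beta + n)                              # log B(alpha + x, beta + n - x)
        + lgamma(alpha + beta) - lgamma(alpha) - lgamma(beta)   # minus log B(alpha, beta)
    )
    return exp(log_p)


def predictive(n, alpha, beta):
    """Whole predictive distribution as a list indexed by x = 0, ..., n."""
    return [beta_binomial_pmf(x, n, alpha, beta) for x in range(n + 1)]


if __name__ == "__main__":
    # Under the uniform prior every outcome should have probability 1 / (n + 1).
    for n in (1, 5, 10):
        print(n, [round(p, 4) for p in predictive(n, 1.0, 1.0)])
```

The same flat shape appears for every n, which is the consistency over n noted in the comments that follow.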

Two comments with regard to Bayes's position, essentially made by Stigler (1982), set the 
stage for our elicitation scheme. First, the role of n is presumably arbitrary. No value of n 
is of special merit, so the uniformity of the distribution of X would hold consistently over n.
Second, knowledge about the data generating process is presumably reflected in a
non-uniform distribution for X, one in which a particular event [X = x*], say, has higher
probability than some other event.

Central to our elicitation scheme is (1) over-determination of the parameters of the prior 
distribution and their reconciliation to give coherency over n and (2) identification of an 
observable event of highest probability and an assessment of the extent to which the 
predictive distribution for X departs from uniformity. To give a specific example, consider
predictive distributions in the beta-binomial class with α = 3 and β = 2 for n = 4 and n = 6.
The two predictive distributions are displayed in Figure 1. Given a certain state of knowledge
by the analyst of the nature of the data generating process as specified by α = 3 and β = 2,
coherency requires that if the predictive distribution is as at the top of Figure 1 for n = 4,
then it must be as at the bottom of Figure 1 for n = 6. In actual elicitation these predictive
distributions will differ. Since a reasonable theory of decisionmaking argues that a person
ought to be coherent, variation in the assessment of α and β over different values of n can
be regarded as noise. It is desirable to reconcile the disparities to achieve coherency.

Fig. 1. Beta-binomial density functions for α = 3, β = 2, n = 4 and n = 6.
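
The two distributions of the example can be recomputed as a rough stand-in for Figure 1. This is a minimal sketch assuming the standard beta-binomial probability function; the helper names are hypothetical.

```python
from math import exp, lgamma


def beta_binomial_pmf(x, n, alpha, beta):
    """P(X = x) in n trials under a Beta(alpha, beta) prior."""
    return exp(
        lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
        + lgamma(alpha + x) + lgamma(beta + n - x) - lgamma(alpha + beta + n)
        + lgamma(alpha + beta) - lgamma(alpha) - lgamma(beta)
    )


def predictive(n, alpha, beta):
    """Predictive probabilities for x = 0, ..., n."""
    return [beta_binomial_pmf(x, n, alpha, beta) for x in range(n + 1)]


def mode(probs):
    """Index of the largest probability (the first one if there are ties)."""
    return max(range(len(probs)), key=lambda x: probs[x])


if __name__ == "__main__":
    # One coherent state of knowledge (alpha = 3, beta = 2) viewed at two values of n.
    for n in (4, 6):
        probs = predictive(n, 3.0, 2.0)
        print("n =", n, [round(p, 4) for p in probs], "mode =", mode(probs))
```

A coherent analyst who reports the first distribution for n = 4 is committed to the second for n = 6; the modes are 3 and 4 successes respectively.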

An analyst assessing his or her knowledge about a data generating process naturally tends 
to anchor (see Tversky and Kahneman, 1981) on the uniform distribution and begin to 
establish departures from uniformity. Arguably, a first step is to establish which outcomes of 
X are most likely, that is to specify modes of the predictive distribution of X as points of 
departure from uniformity. The first question posed according to our method elicits a mode
of the predictive distribution. Hence we call the method PM elicitation, for predictive modal 
elicitation. A refinement is the second set of questions establishing the relative likelihood of 
the modes to adjacent values of X, thereby assessing the extent of departure from uniformity. 
Elicitation methods such as PM elicitation, which focus first on modes of X and then on the
relative likelihood of the modes to adjacent values of X, would seem to be in close accord
with what is known empirically about how individuals assess uncertainty.

On the other hand, other methods for elicitation which have intuitive appeal would begin
with elicitation of means or medians of the predictive distribution. For the analyst to assess 
the mean or median of the distribution of X requires, in the framework we have argued 
reasonably holds, additional cognitive processing. 

In the case of the mean, the extent of this processing is so great that elicitation methods 
which demand it may well produce little more than random noise. To convince oneself of 
this, one can stare at the two predictive distributions displayed in Figure 1 and attempt to 
assess the mean without doing any arithmetic (off-limits also is conjuring up the formula
for a beta-binomial mean in terms of n, α and β). Clearly in the case of skew distributions
the mean is heavily influenced by small probabilities on extreme values of X. Thus the mean
has little utility as a base for subjective elicitation in this context.

The median of X is not so bad as the mean of X as a base for subjective elicitation. It is
certainly not unnatural for an analyst holding a certain predictive distribution for X to assess
a value x_median for which even odds bets that X will be above or below x_median are equally
attractive. But specification of x_median has no special advantages in analytically
determining the prior parameters α and β, and eliciting it is not cognitively less demanding than
eliciting a mode of X. Other quantiles than the median might also be used but, as with the
mean, require further cognitive processing. Also extreme quantiles require knowledge of the
tails of the predictive distribution which are difficult to quantify. Events of small probability 
are notoriously hard to specify (Savage, 1971). 



2 The PM elicitation algorithm 

The PM elicitation algorithm has the analyst choose, for each of various numbers n of trials, 
a modal number m of successes. For the basic case under consideration of a beta prior
distribution which is single peaked, the parameters α and β are each greater than one. In
this case the number of trials n must be large enough for the count of successes and the 
count of failures at the specified mode to be at least one. If the analyst believes that, however 
large n is, either count is zero, then different methods are required. 

Specification of a mode of X places some simple analytical constraints on the prior
parameters α and β. If m is a mode of a beta-binomial distribution, then the ratio of the
probability at each of the adjacent points m − 1 and m + 1 to the probability at m must be
no greater than one. The form of the discrete density function for X is

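$$ f(x; \alpha, \beta) = \frac{\Gamma(n+1)\,\Gamma(\alpha+\beta)\,\Gamma(\alpha+x)\,\Gamma(\beta+n-x)}{\Gamma(\alpha+\beta+n)\,\Gamma(x+1)\,\Gamma(n-x+1)\,\Gamma(\alpha)\,\Gamma(\beta)}, \qquad x = 0, 1, \ldots, n. $$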

Thus, if we denote the ratios of the probability at m − 1 and m + 1 to the probability at
m by d_l and d_r, respectively,

$$ d_l = \frac{f(m-1)}{f(m)} = \frac{m(n-m+\beta)}{(n-m+1)(m+\alpha-1)} \le 1 $$

and

$$ d_r = \frac{f(m+1)}{f(m)} = \frac{(n-m)(m+\alpha)}{(m+1)(n-m+\beta-1)} \le 1. $$

Once the mode is specified, these two inequalities constrain α and β to lie within a
particular cone in the (α, β)-plane. This cone is displayed with solid lines in Figure 2. If, in
addition, the two ratios, d_l and d_r, of the probability at the adjacent values to the
probability at the mode are specified, then α and β are determined as the intersection of the two
dotted lines in Figure 2. The analyst is then given feedback in the form of prediction intervals
and allowed to modify the spread of the predictive distribution while leaving the mode fixed.
This interactive process is the heart of PM elicitation.

Fig. 2. Geometry of PM elicitation.
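
The cone can be written out explicitly. The following is a sketch derived here from the beta-binomial density by cross-multiplying the two mode conditions; it is offered as a check on the geometry and is not quoted from the original figure.

```latex
\frac{f(m+1)}{f(m)} \le 1
  \;\Longleftrightarrow\; (n-m)(\alpha-1) \le (m+1)(\beta-1),
\qquad
\frac{f(m-1)}{f(m)} \le 1
  \;\Longleftrightarrow\; m(\beta-1) \le (n-m+1)(\alpha-1).

% Combined, for \beta > 1:
\frac{m}{n-m+1} \;\le\; \frac{\alpha-1}{\beta-1} \;\le\; \frac{m+1}{n-m}.
```

Both bounds involve α and β only through the ratio (α − 1)/(β − 1), so the admissible region is a cone with apex at (1, 1), and every point on a ray from (1, 1) inside the cone has the same predictive mode m.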

We now describe the details of PM elicitation as it is implemented in an interactive 
Fortran program using the DEC Gigi colour graphics terminal. The first step is the specification
by the analyst of a value of n, the number of trials to be considered. For many
applications, n = 20 appears to be suitable at this first step. Next, the analyst is asked for
the value of m, the most likely number of successes in the n trials. This gives the mode of the
analyst's predictive distribution. The terminal then displays, as feedback to the analyst, a
bar chart of binomial distribution probabilities having the same mode and number of trials.
Specifically, probabilities for a binomial(n, m/n) distribution are displayed. This feedback
gives the analyst an indication of what the analyst's predictive distribution would be if the
analyst knew, without uncertainty, the probability π of success. The analyst then has an
anchor or benchmark for the introduction of uncertainty. Specifically, the values of d_l and
d_r for this binomial distribution, which are calculated and displayed, suggest a lower bound
for d_l and d_r for the analyst's beta-binomial predictive distribution.
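
The binomial benchmark can be sketched as follows. The function names are hypothetical, and the closed forms in the comments come from a short calculation made here rather than from the original program: for a binomial(n, m/n) distribution the ratio at m − 1 is (n − m)/(n − m + 1) and the ratio at m + 1 is m/(m + 1).

```python
from math import comb


def binomial_pmf(x, n, p):
    """Binomial probability of x successes in n trials with success probability p."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def benchmark_ratios(n, m):
    """Ratios d_l = f(m-1)/f(m) and d_r = f(m+1)/f(m) for a binomial(n, m/n) distribution.

    Requires 0 < m < n so that both neighbours of the mode exist.
    """
    if not 0 < m < n:
        raise ValueError("the mode must satisfy 0 < m < n")
    p = m / n
    f = [binomial_pmf(x, n, p) for x in range(n + 1)]
    return f[m - 1] / f[m], f[m + 1] / f[m]


if __name__ == "__main__":
    n, m = 20, 12
    d_l, d_r = benchmark_ratios(n, m)
    # Closed forms: d_l = (n - m) / (n - m + 1), d_r = m / (m + 1).
    print(d_l, (n - m) / (n - m + 1))
    print(d_r, m / (m + 1))
```

An analyst who is uncertain about π would be expected to give ratios closer to one than these benchmark values, since the beta-binomial distribution is flatter around its mode.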

At the next stage of the elicitation process, the analyst provides the numerical values of
d_l and d_r. Each numerical value determines a straight line relationship between α and β.
To ensure that these two lines, shown as the dotted lines in Figure 2, intersect within the
mode cone, the following inequality must be satisfied:

$$ d_l d_r \ge \frac{m(n-m)}{(m+1)(n-m+1)}. $$



If the analyst gives values violating this inequality, feedback indicates the violation and the
analyst is asked to respecify d_l and d_r. Once values satisfying the inequality have been given,
the program calculates the values α_1 and β_1 corresponding to the values of d_l and d_r. These
values α_1 and β_1 are interpreted as initial "estimates" of the parameters α and β of the beta
prior distribution. They also represent initial changes from the uniform distribution case
of (α_0, β_0) = (1, 1).
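
The calculation of the initial estimates can be sketched by solving the two ratio equations as a pair of straight lines. The algebra below is derived here from the beta-binomial density, with d_l = f(m − 1)/f(m) and d_r = f(m + 1)/f(m); the function name and error handling are assumptions and do not describe the original Fortran code. When the product of the ratios equals the bound exactly the two lines are parallel, so this sketch treats the bound as strict.

```python
def solve_prior(n, m, d_l, d_r):
    """Return (alpha, beta) reproducing the ratios d_l and d_r at mode m in n trials."""
    if not 0 < m < n:
        raise ValueError("the mode must satisfy 0 < m < n")
    if not (0.0 < d_l <= 1.0 and 0.0 < d_r <= 1.0):
        raise ValueError("both ratios must lie in (0, 1]")
    # With A = m + alpha and B = n - m + beta the two ratio equations become
    # the straight lines B = c_l * (A - 1) and A = c_r * (B - 1).
    c_l = d_l * (n - m + 1) / m
    c_r = d_r * (m + 1) / (n - m)
    if c_l * c_r <= 1.0:
        raise ValueError("d_l * d_r must exceed m(n-m) / ((m+1)(n-m+1))")
    a_shift = c_r * (c_l + 1.0) / (c_l * c_r - 1.0)   # A at the intersection
    b_shift = c_l * (c_r + 1.0) / (c_l * c_r - 1.0)   # B at the intersection
    return a_shift - m, b_shift - (n - m)


if __name__ == "__main__":
    # Ratios implied by alpha = 3, beta = 2 with n = 4 and mode m = 3.
    print(solve_prior(4, 3, 0.9, 0.75))
```

Taking d_l = d_r = 1 in this sketch returns (1, 1), the uniform case, while ratios at or below the binomial benchmark are rejected.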

At this stage of the algorithm, we take the mode, (α − 1)/(α + β − 2), of the beta prior
distribution to have been elicited. The spread, however, may not yet have been adequately 
elicited. The program now keeps the mode of the beta distribution fixed and enters an 
interactive stage to elicit the proper spread. Note that fixing the mode of the beta distribution 
fixes the mode of the beta-binomial distribution, whereas fixing the mean of the beta 
distribution does not fix the mode of the beta-binomial distribution. 
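
This contrast can be illustrated numerically. The sketch below assumes the standard beta-binomial probability function and uses parameter values chosen here purely for illustration: four priors sharing the beta mode 2/3, and three priors sharing the beta mean 0.6.

```python
from math import exp, lgamma


def beta_binomial_pmf(x, n, alpha, beta):
    """P(X = x) in n trials under a Beta(alpha, beta) prior."""
    return exp(
        lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
        + lgamma(alpha + x) + lgamma(beta + n - x) - lgamma(alpha + beta + n)
        + lgamma(alpha + beta) - lgamma(alpha) - lgamma(beta)
    )


def predictive_mode(n, alpha, beta):
    """Most probable number of successes in n trials."""
    return max(range(n + 1), key=lambda x: beta_binomial_pmf(x, n, alpha, beta))


# Points on the ray from (1, 1) through (3, 2): same beta mode, different spread.
SAME_BETA_MODE = [(2.0, 1.5), (3.0, 2.0), (5.0, 3.0), (9.0, 5.0)]
# Points with alpha / (alpha + beta) = 0.6: same beta mean, different spread.
SAME_BETA_MEAN = [(1.5, 1.0), (3.0, 2.0), (30.0, 20.0)]

if __name__ == "__main__":
    n = 4
    print([predictive_mode(n, a, b) for a, b in SAME_BETA_MODE])
    print([predictive_mode(n, a, b) for a, b in SAME_BETA_MEAN])
```

For n = 4 the predictive mode stays at 3 along the ray, whereas among the priors with equal mean it moves from 4 to 3 as the prior becomes more concentrated.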

The program takes α_1 and β_1 and provides graphical feedback illustrating the smallest
prediction interval of the beta-binomial distribution with at least 50 per cent probability
content. Also calculated and displayed are the exact predictive probabilities in this interval
and their sum, the probability content of the interval. The analyst is then asked if this
predictive interval is (a) too short, (b) too long, or (c) the right length. If the analyst answers
(c), so that the interval is consistent with the analyst's beliefs about the appropriate length,
the elicitation for this value of n is complete. The analyst is then presented with the values
of α_1 and β_1.
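
One way to compute such an interval is sketched below. The text does not say how the program constructs the interval, so the rule used here is an assumption: scan all runs of consecutive outcomes, take the shortest run whose content reaches the target, and break ties in favour of the larger content. The probabilities in the example are the exact values for α = 3, β = 2 and n = 4.

```python
def shortest_interval(probs, content=0.5):
    """Shortest run of consecutive outcomes with total probability >= content.

    probs[x] is the predictive probability of x successes.
    Returns (low, high, total); ties in length go to the larger total.
    """
    n = len(probs) - 1
    for width in range(1, n + 2):                 # number of outcomes in the run
        best = None
        for low in range(0, n + 2 - width):
            total = sum(probs[low:low + width])
            if total >= content and (best is None or total > best[2]):
                best = (low, low + width - 1, total)
        if best is not None:
            return best
    return (0, n, sum(probs))                     # fallback if rounding prevents a match


if __name__ == "__main__":
    # Beta-binomial probabilities for alpha = 3, beta = 2, n = 4.
    probs = [1 / 14, 6 / 35, 9 / 35, 2 / 7, 3 / 14]
    print(shortest_interval(probs))
```

For this example the interval is {2, 3} with probability content 19/35, a little above one half, which shows why the displayed content is generally not exactly 50 per cent.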

If the 50 per cent prediction interval is too short, the analyst is given a "longer"
prediction interval calculated by decreasing the values of α_1 and β_1 to α_2 and β_2. The 50 per
cent prediction interval implied by the new values α_2 and β_2 may not literally be longer, but
it will have less total probability on the points in the earlier prediction interval. On the other
hand, if the 50 per cent prediction interval is too long, the analyst is given a "shorter"
prediction interval calculated by increasing the values of α_1 and β_1 to α_2 and β_2. The analyst
is then given the 50 per cent prediction interval implied by α_2 and β_2 and asked again
whether it is of the right length. The process iterates until the analyst is satisfied with the
length of the interval. The analyst is then presented with the values of α and β at the final
step.

In this stage of adjusting the length of the prediction interval, the program chooses new
values of α and β according to a process similar to binary search. Specifically, all choices of
α_i and β_i are restricted to the ray from (1, 1) through (α_1, β_1). This restriction keeps the
mode of the elicited beta prior distribution constant. If the analyst initially says the
prediction interval is too long, the values α_1 − 1 and β_1 − 1 are doubled to obtain α_2 − 1 and
β_2 − 1. This doubling of each successive pair is continued until the analyst says the
prediction interval is too short. This gives an upper bound on α and β. Let the first time the
analyst says the interval is too short be denoted by i. Then (α, β) is restricted to lie within
the interval joining (α_{i−1}, β_{i−1}) to (α_i, β_i). The value of (α_{i+1}, β_{i+1}) is given by
α_{i+1} = ½(α_i + α_{i−1}) and β_{i+1} = ½(β_i + β_{i−1}). This process is continued as long as the
analyst is dissatisfied with the length of the prediction interval. Specifically, at the
(i+j)th iteration, if the analyst says the interval is too short, then
α_{i+j+1} = α_{i+j} − 2^{−j−1}(α_i − α_{i−1}) and β_{i+j+1} = β_{i+j} − 2^{−j−1}(β_i − β_{i−1}).
Similarly, if at this stage the analyst says the interval is too long then
α_{i+j+1} = α_{i+j} + 2^{−j−1}(α_i − α_{i−1}) and β_{i+j+1} = β_{i+j} + 2^{−j−1}(β_i − β_{i−1}).
Thus the iterations will give (α, β)-values which converge to finite values. The analyst has
specified a particular beta-binomial distribution as the analyst's predictive distribution. If
the analyst never specifies the interval is too short, the analyst is indicating certainty about
the value of π so that the binomial distribution adequately represents the analyst's predictive
distribution.
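
The search along the ray can be sketched as follows. Points on the ray are written as (1 + t(α_1 − 1), 1 + t(β_1 − 1)), so t = 1 is the initial estimate and t = 0 is the uniform case. The callback `respond`, the step limit, and the simulated analyst are assumptions made for illustration; in the program described above the answers come from the analyst at the terminal.

```python
def pm_spread_search(alpha1, beta1, respond, max_steps=60):
    """Doubling then halving along the ray from (1, 1) through (alpha1, beta1).

    respond(alpha, beta) returns "short", "long" or "ok": the verdict on the
    50 per cent prediction interval implied by (alpha, beta).
    """
    da, db = alpha1 - 1.0, beta1 - 1.0

    def point(t):
        return (1.0 + t * da, 1.0 + t * db)

    # Doubling phase: "long" calls for a tighter predictive, hence larger parameters.
    t_prev, t = 0.0, 1.0
    for _ in range(max_steps):
        answer = respond(*point(t))
        if answer == "ok":
            return point(t)
        if answer == "short":
            break
        t_prev, t = t, 2.0 * t
    else:
        return point(t)          # never "short": effectively a binomial predictive

    # Halving phase: the first move is to the midpoint of the bracketing interval,
    # and each later move is half the size of the one before it.
    step = (t - t_prev) / 2.0
    t -= step
    for _ in range(max_steps):
        answer = respond(*point(t))
        if answer == "ok":
            break
        step /= 2.0
        t = t - step if answer == "short" else t + step
    return point(t)


def simulated_analyst(target_alpha, tolerance):
    """Hypothetical analyst who is content once alpha is within tolerance of a target."""
    def respond(alpha, beta):
        if alpha < target_alpha - tolerance:
            return "long"
        if alpha > target_alpha + tolerance:
            return "short"
        return "ok"
    return respond


if __name__ == "__main__":
    print(pm_spread_search(3.0, 2.0, simulated_analyst(7.6, 0.15)))
```

Starting from (3, 2), the simulated analyst is shown α values 3, 5, 9, 7, 8 and finally 7.5, and the ratio (α − 1)/(β − 1) stays equal to 2 throughout, so the mode is unchanged.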

At this point the analyst may end the elicitation process and use the beta prior distribution
of π which has been determined. Or, as seems preferable, the analyst may repeat the PM
elicitation scheme using a different value of n, the number of trials. If the analyst is coherent,
in the full sense with no elicitation errors, the elicited values of (α, β) for the different values
of n will all be the same. In practice, these values will differ and the analyst will have several
"estimates", say, (α^1, β^1), (α^2, β^2), ..., (α^k, β^k), one for each of the k values of n used.
There are three evident approaches to reconciling these values. One approach would be to ask the
analyst to reconcile them in any way the analyst may choose, which might result in the
selection of one the analyst finds most appealing, for whatever reason. A second approach
would be to mechanically just choose some center, say the mean or median, of α^i and β^i,
i = 1, ..., k. Either of these two approaches seems perfectly adequate when the values of
(α^i, β^i) are not too disparate. A third approach is one in which the statistician (or the
analyst) uses the elicited values as data and estimates in a Bayesian fashion the analyst's
postulated underlying values of α and β. This formal Bayes approach would require the
assumption that the elicitation method prompted responses with errors associated with
them, the specification of a prior distribution for α and β and, perhaps less easily, the joint
sampling distribution of the elicited values. As one aspect of the difficulty of the latter task,
we note that the elicited values are presumably not independent, even conditionally on
α and β.
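
The second, mechanical approach is simple enough to sketch. The function name is hypothetical and the three pairs in the example are made-up numbers for illustration only, not elicited values from any study.

```python
from statistics import mean, median


def reconcile(estimates, center=median):
    """Combine elicited (alpha, beta) pairs by applying a center to each parameter."""
    alphas = [a for a, _ in estimates]
    betas = [b for _, b in estimates]
    return center(alphas), center(betas)


if __name__ == "__main__":
    # Hypothetical estimates from k = 3 different values of n.
    elicited = [(3.0, 2.0), (3.4, 2.2), (2.9, 1.9)]
    print("median:", reconcile(elicited, median))
    print("mean:  ", reconcile(elicited, mean))
```

Treating α and β separately is the simplest reading of "some center"; it does not by itself guarantee that the reconciled pair reproduces any of the elicited modes, which is one reason to inspect the disparities before averaging.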

In any case the overspecification of estimates of α and β seems to be an essential ingredient
of a good elicitation scheme. In particular, it allows the statistician and the analyst to 
examine aspects of the analyst's elicited responses which have led to incoherency. It also 
permits a check on whether the underlying beta-binomial model is adequate to represent the 
analyst's beliefs. 



3 Discussion 

There has been some discussion in the statistical literature of how to elicit a beta prior 
distribution; see, for example, Bunn (1978, 1979). Proposed methods have concentrated on 
asking directly for the mean and variance of the beta prior distribution or have 
asked for the mean or variance of the posterior distribution after imaginary future results. 
We suggest that PM elicitation is an improvement for several reasons. 

First, the method utilizes the current facilities of a computer, providing the interactive 
capability of structured questions, instant feedback, and graphical feedback. The instant 
graphical feedback appears in practice to be an especially helpful aspect of PM elicitation. 
Second, PM elicitation deals with the predictive distribution, the distribution of observable 
quantities, which we regard as more basic than the underlying prior distribution. Third, in 
the particular context of the beta-binomial distribution PM elicitation is appealing in that 
it deals with modes rather than with means or medians. Modes appear to be easier to process 
cognitively. 

The beta-binomial distribution is a special case of the Dirichlet-multinomial distribution. 
We have expanded and adapted PM elicitation to this case (Chaloner and Duncan, 1982). 
A unimodality property of Dirichlet-multinomial distributions is proved and utilized in this 
paper. The coherency of analysts over different values of n has also been investigated
empirically and we will report findings in a later paper. 

In conclusion, we note that the major strength of Bayesian inference is that it optimally 
combines the expertise of the analyst with the information of the data. In effectively using 
this strength of introducing the expertise of the analyst, practical Bayesian statistics requires 
good methods for the elicitation of expert opinion. We view PM elicitation as a start on this
task for inference regarding count data. 



References 

Bayes, Thomas (1764). An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53, 370-418. Reprinted in Barnard (1958). Biometrika, 45, 293-315.
Bunn, D. W. (1978). The estimation of a Dirichlet prior density. Omega, 6 (4), 371-3.

Bunn, D. W. (1979). Estimation of subjective probability distributions in forecasting and decision making. Technological Forecasting and Social Change, 14 (1), 205-16.
Chaloner, K. M. and Duncan, G. T. (1982). Interactive Elicitation of a Dirichlet-multinomial Distribution in the Spirit of Bayes. Technical Report 208, Carnegie-Mellon University.
Geisser, S. (1980). The Estimation of Distribution Functions and the Prediction of Future Observations, pp. 193-208. Academia Sinica, Taiwan.
Kadane, J. B. (1980). Predictive and structural methods for eliciting prior distributions. In Bayesian Analysis in Econometrics and Statistics, North-Holland, Amsterdam.
Kadane, J. B., Dickey, J. M., Winkler, R. L., Smith, W. S. and Peters, S. C. (1980). Interactive elicitation of opinion for a normal linear model. Journal of the American Statistical Association, 75, 845-54.
Savage, L. J. (1971). Elicitation of personal probabilities and expectations. Journal of the American Statistical Association, 66, 565-614.
Stigler, Stephen M. (1982). Thomas Bayes's Bayesian inference. Manuscript. Department of Statistics, University of Chicago.
Tversky, Amos and Kahneman, Daniel (1981). The framing of decisions and the psychology of choice. Science, 211 (30 January), 453-8.






